Abstract

The notion of generalised exponential family is considered in the
restricted context of nonextensive statistical physics. Examples are
given of models belonging to this family. In particular, the q-Gaussians
are discussed and it is shown that the configurational probability
distributions of the microcanonical ensemble belong to the q-exponential
family.

1 Introduction

In statistics, a model is a probability distribution which depends on a number
of parameters. With this definition, statistical physics abounds with models.
Typical parameters are the total energy U in the microcanonical ensemble,
or the inverse temperature β in the canonical ensemble. This parameter
dependence is important to understand why certain models belong to the
exponential family and others do not. In particular, all models described by
a Boltzmann-Gibbs distribution

    f_\beta(x) = \frac{c(x)}{Z(\beta)} \exp(-\beta H(x)) = c(x) \exp(-\ln Z(\beta) - \beta H(x)),    (1)
where H(x) is the Hamiltonian of the system and Z(β) is the normalisation,
belong to the exponential family because they have the right dependence on
the inverse temperature β.

Recently, the notion of the exponential family has been generalised by the
present author in a series of papers [19], [20], [33]. The same definition of
generalised exponential family has also been introduced in the mathematics
literature [121 EDI E21 E3]. This class of models was also derived using the
maximum entropy principle in [I5l EH].

Many but not all of the models of non-extensive statistical physics [3], [IB]
belong to the generalised exponential family. They are obtained by replacing
in (1) the exponential function by a q-deformed exponential function [13]
(see the next Section). An important question is then whether in the
modification the normalisation should stand in front of the deformed exponential
function, or whether it should be included as ln Z(β) inside. From the
general formalism mentioned above it follows that the latter is the right way to
go. It is the intention of the present paper to give examples and to show
what the generalised formalism looks like when restricted to the context of
non-extensive statistical physics.

The next Sections recall the definition of deformed logarithmic and
exponential functions and introduce the notion of the q-exponential family. In
Section 4, a number of physically relevant examples are discussed. Sections
5 to 8 give the proof of the variational principle. Sections 9 to 13 discuss the
geometrical structure behind the q-exponential family. In Section 14 some
final remarks are made. The short appendix contains a table with often used
formulas.



2 Deformed logarithmic and exponential functions

The q-deformed logarithm was introduced in [10]. It is defined by

    \ln_q(u) = \frac{1}{1-q} \left( u^{1-q} - 1 \right), \qquad u > 0.    (2)

Its first derivative is

    \frac{d}{du} \ln_q(u) = \frac{1}{u^q}.    (3)

This derivative is positive for any value of q. Hence, the deformed logarithm
is always a strictly increasing function, which is important in the sequel. In
the limit q = 1 the deformed logarithm reduces to the natural logarithm ln u.
The inverse function is the deformed exponential function

    \exp_q(u) = \left[ 1 + (1-q) u \right]_+^{1/(1-q)}.    (4)

The notation [u]_+ = max{0, u} is used. One has 0 ≤ exp_q(u) ≤ +∞ for all
u. For q ≠ 1 the range of ln_q(u) is not the full line. By putting exp_q(u) = 0
when u is below the range of ln_q(u), and equal to +∞ when it is above,
exp_q(u) becomes an increasing function of u, defined for all values of u.
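
The two definitions above can be checked numerically. The following is a minimal Python sketch, not code from the paper: the function names `ln_q` and `exp_q` are illustrative, and the extension of exp_q to all real u follows the convention just described (0 below the range of ln_q, +∞ above it).

```python
import math

def ln_q(u, q):
    """q-deformed logarithm of Eq. (2); reduces to the natural logarithm at q = 1."""
    if u <= 0:
        raise ValueError("ln_q is defined for u > 0 only")
    if q == 1:
        return math.log(u)
    return (u ** (1 - q) - 1) / (1 - q)

def exp_q(u, q):
    """q-deformed exponential of Eq. (4), extended to all real u."""
    if q == 1:
        return math.exp(u)
    base = 1 + (1 - q) * u
    if base <= 0:
        # u is outside the range of ln_q:
        # below it when q < 1 (value 0), above it when q > 1 (value +infinity).
        return 0.0 if q < 1 else math.inf
    return base ** (1 / (1 - q))

# exp_q inverts ln_q on u > 0 for every q tried here.
for q in (0.5, 1, 2, 3):
    assert abs(exp_q(ln_q(2.5, q), q) - 2.5) < 1e-12
```

For q < 1 the function is cut off at zero for sufficiently negative arguments, while for q > 1 it diverges at a finite positive argument; both cases are handled by the single `base <= 0` branch.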

3 The q-exponential family

Some interesting models of statistical physics can be written in the following
form

    f_\beta(x) = c(x) \exp_q(-\alpha(\beta) - \beta H(x)).    (5)

If the q-exponential in the r.h.s. diverges then f_β(x) = 0 is assumed. The
function H(x) is the Hamiltonian. The parameter β is usually the inverse
temperature. The normalisation α(β) is written inside the q-exponential.
The function c(x) is the prior distribution. It is a reference measure and
must not depend on the parameter β.

If a model is of the above form then it is said to belong to the q-exponential
family. In the limit q = 1 these are the models of the standard exponential
family. In that case the expression (5) reduces to

    f_\beta(x) = c(x) \exp(-\alpha(\beta) - \beta H(x)),    (6)

which is known as the Boltzmann-Gibbs distribution.

The convention that f_β(x) = 0 when the r.h.s. of (5) diverges may seem
weird. However, one can argue [33] that this is the right thing to do. Also,
the example of the harmonic oscillator, given below, will clarify this point. A
reformulation of (5) is therefore (see Theorem 2 of [33]) that either f_β(x) = 0
or

    \ln_q\left( \frac{f_\beta(x)}{c(x)} \right) = -\alpha(\beta) - \beta H(x).    (7)

The q-exponential family is a special case of the generalised exponential
family introduced in [19], [20], [33]. Models belonging to such a family share a
number of interesting properties. In particular, they all fit into the
thermodynamic formalism. As a consequence, the probability density f_β(x) may be
considered to be the equilibrium distribution of the model at the given value
of the parameter β.

Figure 1: q-Gaussians for σ = 1 and q = 1/2, q = 1, q = 2.



4 Examples 

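4.1 The q-Gaussian distribution (q < 3)

Many of the distributions encountered in the literature on nonextensive
statistics (see for instance [TH]) can be brought into the form (5). A prominent
model encountered in this context is the q-Gaussian distribution (see
for instance [22l EH [3l] and [25l EH ETj)

    f(x) = \frac{1}{c_q \sigma} \exp_q(-x^2/\sigma^2),    (8)

with

    c_q = \int_{-\infty}^{\infty} dx \, \exp_q(-x^2)
        = \sqrt{\frac{\pi}{q-1}} \, \frac{\Gamma\left(-\frac{1}{2} + \frac{1}{q-1}\right)}{\Gamma\left(\frac{1}{q-1}\right)}   if 1 < q < 3,
        = \sqrt{\frac{\pi}{1-q}} \, \frac{\Gamma\left(1 + \frac{1}{1-q}\right)}{\Gamma\left(\frac{3}{2} + \frac{1}{1-q}\right)}   if q < 1.    (9)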



It can be brought into the form (5) with c(x) = 1/c_q, H(x) = x², β = σ^{q−3},
and

    \alpha(\beta) = \frac{\sigma^{q-1} - 1}{q - 1} = \ln_{2-q}(\sigma).    (10)



Take for instance q = 1/2. Then (8) becomes

    f(x) = \frac{15\sqrt{2}}{32\sigma} \left[ 1 - \frac{x^2}{2\sigma^2} \right]_+^2.    (11)

Note that this distribution vanishes outside the interval [−√2 σ, √2 σ]. The q = 1
case reproduces the conventional Gauss distribution. For q = 2 one obtains

    f(x) = \frac{1}{\pi} \frac{\sigma}{x^2 + \sigma^2}.    (12)

This is known as the Cauchy distribution. The function (12) is also called a
Lorentzian. In the range 1 ≤ q < 3 the q-Gaussian is strictly positive on the
whole line. For q < 1 it is zero outside an interval. For q ≥ 3 the distribution
cannot be normalised because

    f(x) \sim \frac{1}{|x|^{2/(q-1)}}   as |x| \to \infty.    (13)
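
The normalisation of the q-Gaussian can be verified numerically. The sketch below is illustrative only: it assumes the form f(x) = exp_q(−x²/σ²)/(c_q σ), uses the closed form of c_q that follows from standard Beta-function integrals, and integrates over the whole line with the substitution x = σ tan θ. The helper names and the midpoint rule are choices made for this example.

```python
import math

def exp_q(u, q):
    """q-deformed exponential [1 + (1 - q) u]_+^(1/(1 - q)), with exp at q = 1."""
    if q == 1:
        return math.exp(u)
    base = 1 + (1 - q) * u
    if base <= 0:
        return 0.0 if q < 1 else math.inf
    return base ** (1 / (1 - q))

def c_q(q):
    """Integral of exp_q(-x^2) over the real line (closed form, q < 3)."""
    if q == 1:
        return math.sqrt(math.pi)
    if 1 < q < 3:
        m = 1 / (q - 1)
        return math.sqrt(math.pi / (q - 1)) * math.gamma(m - 0.5) / math.gamma(m)
    if q < 1:
        n = 1 / (1 - q)
        return math.sqrt(math.pi / (1 - q)) * math.gamma(1 + n) / math.gamma(1.5 + n)
    raise ValueError("the q-Gaussian cannot be normalised for q >= 3")

def q_gaussian(x, q, sigma=1.0):
    """q-Gaussian density with width parameter sigma."""
    return exp_q(-(x / sigma) ** 2, q) / (c_q(q) * sigma)

def total_mass(q, sigma=1.0, n=20000):
    """Midpoint rule for the integral of the density, using x = sigma * tan(theta)."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -math.pi / 2 + (k + 0.5) * h
        x = sigma * math.tan(theta)
        total += q_gaussian(x, q, sigma) * sigma / math.cos(theta) ** 2
    return total * h

for q in (0.5, 1, 1.5, 2):
    assert abs(total_mass(q) - 1) < 1e-4
```

With σ = 1 the value at the origin is 15√2/32 for q = 1/2 and 1/π for q = 2, in agreement with the compact-support and Cauchy cases discussed above.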



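4.2 Kappa-distributions (1 < q < 5/3)

The following distribution is known in plasma physics as the kappa-distribution
(see for instance [6])

    f(v) = \frac{A(\kappa)}{v_0^3} \, v^2 \left( 1 + \frac{1}{\kappa - a} \frac{v^2}{v_0^2} \right)^{-(1+\kappa)}.    (14)

This distribution is a modification of the Maxwell distribution, exhibiting a
power law decay like f(v) ≃ v^{−2κ} for large v.

Expression (14) can be written in the form of a q-exponential with
q = 1 + 1/(1 + κ) and

    f(v) = \frac{A(\kappa)}{v_0^3} \, v^2 \exp_q\left( -\frac{1}{2 - q - (q-1)a} \frac{v^2}{v_0^2} \right).    (15)

However, in order to be of the form (5), the prefactor of (15) should not
depend on the parameter v_0. Introduce an arbitrary constant c > 0 with the
dimensions of a velocity. Then one can write

    f(v) = \frac{4\pi v^2}{c^3} \exp_q\left( -\ln_{2-q}\left( \frac{4\pi}{A(\kappa)} \left( \frac{v_0}{c} \right)^3 \right)
           - \left( \frac{4\pi}{A(\kappa)} \left( \frac{v_0}{c} \right)^3 \right)^{q-1} \frac{1}{2 - q - (q-1)a} \frac{v^2}{v_0^2} \right).    (16)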

Figure 2: Speed distribution for a harmonic oscillator with v_0 = 1.



This is now of the form (5) with prior distribution c(v) = 4πv²/c³ and
Hamiltonian H(v) = ½mv². The inverse temperature β is given by

    \beta = \left( \frac{4\pi}{A(\kappa)} \left( \frac{v_0}{c} \right)^3 \right)^{q-1} \frac{2}{[2 - q - (q-1)a] \, m v_0^2}.    (17)

In the q = 1 limit one obtains the Maxwell distribution with β = 2/(m v_0²), as it
should be. If q > 1 then the inverse temperature β depends on the choice of
the arbitrary constant c, while the distribution function does not depend on
c. This is rather disturbing since it means that a fit of the kappa-distribution
to experimental data does not result in an absolute value for the inverse
temperature β.
The inequality κ > 1/2 is required to make f(v) normalisable. This implies
that 1 < q < 5/3. From β ∼ v_0^{3(q−1)−2} it then follows that β is a monotonically
decreasing function of the average velocity v_0, as it should be.
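
The statement that β depends on the arbitrary velocity c while the distribution itself does not can be illustrated numerically. The sketch below is an assumption-laden example: it takes the kappa-distribution as f(v) = A v² v_0⁻³ (1 + v²/((κ − a) v_0²))^{−(1+κ)}, treats A and a as free constants with hypothetical values, and builds the q-exponential form with prior 4πv²/c³, Hamiltonian ½mv² and q = 1 + 1/(1 + κ).

```python
import math

def ln_q(u, q):
    """q-deformed logarithm (u > 0)."""
    return math.log(u) if q == 1 else (u ** (1 - q) - 1) / (1 - q)

def exp_q(u, q):
    """q-deformed exponential, set to 0 or +infinity outside the range of ln_q."""
    if q == 1:
        return math.exp(u)
    base = 1 + (1 - q) * u
    if base <= 0:
        return 0.0 if q < 1 else math.inf
    return base ** (1 / (1 - q))

def kappa_form(v, kappa, a, v0, A):
    """Kappa-distribution written directly as a power law in v."""
    return A * v * v / v0 ** 3 * (1 + v * v / ((kappa - a) * v0 * v0)) ** (-(1 + kappa))

def q_family_form(v, kappa, a, v0, A, c, m):
    """Same density as c(v) * exp_q(-alpha - beta * H(v)); returns (density, beta)."""
    q = 1 + 1 / (1 + kappa)
    B = 4 * math.pi / A * (v0 / c) ** 3
    beta = B ** (q - 1) * 2 / ((2 - q - (q - 1) * a) * m * v0 * v0)
    alpha = ln_q(B, 2 - q)
    prior = 4 * math.pi * v * v / c ** 3
    hamiltonian = 0.5 * m * v * v
    return prior * exp_q(-alpha - beta * hamiltonian, q), beta

# Hypothetical parameter values, chosen only for illustration.
kappa, a, v0, A, m = 3.0, 1.5, 1.3, 0.7, 2.0
f1, beta1 = q_family_form(0.8, kappa, a, v0, A, c=1.0, m=m)
f2, beta2 = q_family_form(0.8, kappa, a, v0, A, c=5.0, m=m)
```

Both calls return the same density as `kappa_form(0.8, ...)`, while `beta1` and `beta2` differ, which is the ambiguity discussed in the text.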

4.3 Speed of the harmonic oscillator (q = 3)

The distribution of velocities v of a classical harmonic oscillator is given by

    f(v) = \frac{1}{\pi} \frac{1}{\sqrt{v_0^2 - v^2}}.    (18)

It diverges when |v| approaches its maximal value v_0 and vanishes for |v| > v_0.
See Figure 2. This distribution can be written in the form (5) of a
q-exponential family with q = 3. To do so, let x = v and

    c(x) = \frac{\sqrt{2}}{\pi |v|}, \quad \beta = \frac{1}{2} m v_0^2, \quad
    H(x) = \frac{1}{\frac{1}{2} m v^2} = \frac{1}{K}, \quad \alpha(\beta) = -\frac{3}{2}.    (19)

Note that in this example the roles of the inverse temperature and the
Hamiltonian H(x) are interchanged. The parameter β in this model is the
total energy ½mv_0² of the harmonic oscillator. The stochastic variable H,
used to estimate the total energy, is the inverse of the kinetic energy K.
Note however that the average of the latter diverges.
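
The q = 3 representation also illustrates the convention that the density is set to zero where the q-exponential diverges. The following sketch assumes the identification c(x) = √2/(π|v|), β = ½mv_0², H = 1/(½mv²) and α = −3/2, and compares it with the direct expression 1/(π√(v_0² − v²)); the function names and the chosen mass are illustrative.

```python
import math

def exp_q(u, q):
    """q-deformed exponential, +infinity above the range of ln_q when q > 1."""
    if q == 1:
        return math.exp(u)
    base = 1 + (1 - q) * u
    if base <= 0:
        return 0.0 if q < 1 else math.inf
    return base ** (1 / (1 - q))

def speed_density(v, v0):
    """Direct form of the speed density; it diverges as |v| -> v0 and is 0 beyond."""
    if abs(v) >= v0:
        return 0.0  # the single point |v| = v0 is set to 0 here for simplicity
    return 1 / (math.pi * math.sqrt(v0 * v0 - v * v))

def q3_form(v, v0, m=1.0):
    """Same density as c(v) * exp_3(-alpha - beta * H(v)), for v != 0."""
    beta = 0.5 * m * v0 * v0          # total energy plays the role of beta
    hamiltonian = 1 / (0.5 * m * v * v)  # inverse kinetic energy
    alpha = -1.5
    value = exp_q(-alpha - beta * hamiltonian, 3)
    if math.isinf(value):
        return 0.0  # convention: density vanishes where exp_q diverges
    return math.sqrt(2) / (math.pi * abs(v)) * value

for v in (0.2, 0.5, 0.9):
    assert abs(q3_form(v, 1.0) - speed_density(v, 1.0)) < 1e-9
assert q3_form(1.2, 1.0) == 0.0
```

For |v| > v_0 the argument of exp_3 lies above the range of ln_3, so the deformed exponential is +∞ and the convention returns a vanishing density, as required physically.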

4.4 Configurational density distribution (q = −1)

The harmonic oscillator is a very special example because its density of states

    \rho(U) = \frac{1}{h} \int dq \int dp \, \delta(H(q,p) - U)    (20)

is constant and because it is quadratic in both the position and the momentum
variables. Because of the latter property the roles of the kinetic energy
and the potential energy can be interchanged. This is what is done below.

Consider a classical particle in d = 3 dimensions with mass m in a potential
V(q). The Hamiltonian is

    H(q,p) = \frac{1}{2m} |p|^2 + V(q).    (21)

The microcanonical probability distribution equals

    f_U(q,p) = \frac{1}{\rho(U)} \delta(H(q,p) - U),    (22)

where the density of states ρ(U) is now defined with the phase space measure
dq dp/h³. The configurational density distribution is obtained by integrating
out the momenta

    f_U(q) = \frac{1}{h^3} \int dp \, f_U(q,p)    (23)
           = \frac{2c}{\rho(U)} \left[ U - V(q) \right]_+^{1/2}
           = c \left[ 2\beta (U - V(q)) \right]_+^{1/2},    (24)

with c = 2√2 π m √m / h³ and β = 2/ρ(U)². Note that f_U(q) is now in the form
(5) with x = q, c(x) = c, and

    -\alpha(\beta) = -\frac{1}{2} + \beta U(\beta),    (25)

    H(x) = V(q).    (26)

Hence, the probability distribution of the position q of the particle belongs
to the q-exponential family with q = −1. The correct interpretation of this
result is that the measured values of

    \langle V \rangle_\beta = \int dq \, f_U(q) V(q)    (27)

can be used to estimate the parameter β. The latter determines the original
parameter of the microcanonical model, which is the total energy U(β).
Indeed, assume that the density of states ρ(U) is a strictly increasing function
of U. This is for instance the case when V(q) ∼ |q|², which implies
ρ(U) ∼ U². Then the function ρ(U) can be inverted and the knowledge of β
uniquely determines the total energy U.



5 The variational principle 

An important argument justifying the statement that the model distributions
(5) exhibit statistical equilibrium is that they formally satisfy a maximum
entropy principle.

It has long been known pQ that the probability distributions of a model
belonging to the exponential family satisfy not only the maximum entropy
principle, but also a stronger statement, which is known in the mathematical
physics literature as the variational principle. In physical terms this principle
states that the free energy is minimal in equilibrium.

The thermodynamic definition of free energy is F = U − TS, where U
is the average energy ⟨H⟩, S is the entropy, and T is the temperature (the
inverse of β when units are taken so that the Boltzmann constant equals 1).
It is slightly more convenient to maximise Φ = S − βU instead of minimising
F. This function Φ is known as Massieu's function.

In what follows it is shown that the model distributions (5) satisfy a
generalised version of the variational principle. In the cases where the average
energy U diverges the variational principle is satisfied only at the level of the
microstates.

6 Choice of the entropy function

A general form of entropy function I(f) is ^151 121 EHl EOl [33]

    I(f) = -\int dx \, c(x) \, F\left( \frac{f(x)}{c(x)} \right),    (28)

with

    F(u) = \int_1^u dv \, \Lambda(v) + A.    (29)

The function Λ(v) should be a strictly increasing function for I(f) to be an
entropy function. It may be interpreted as a deformed logarithm. Hence, it
is natural to take Λ(v) = ln_q(v). The result is then

    F_q(u) = \frac{1}{1-q} \left( \frac{u^{2-q} - 1}{2-q} - (u - 1) \right) + A.    (30)

The corresponding entropy function is denoted I_q(f).

The constant A in (29) is not yet determined. Conventionally, it is chosen
so that F_q(0) = F_q(1) = 0. This is only possible when q = 1. Moreover, F_q(0)
diverges for q ≥ 2. If q < 2 then it is natural to choose A so that F_q(0) = 0,
which gives

    F_q(u) = \frac{1}{1-q} \left( \frac{u^{2-q}}{2-q} - u \right).    (31)

This choice of the function F(u) in (28) reproduces the Tsallis entropy [3]
up to two modifications: a change of q by 2 − q and the additional factor
1/(2 − q) in front of u^{2−q}.
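
The defining property of the function F_q, namely that its derivative is the deformed logarithm, is easy to check numerically. The sketch below assumes the normalised form F_q(u) = (u^{2−q}/(2 − q) − u)/(1 − q), valid for q < 2 and q ≠ 1 with the constant A chosen so that F_q(0) = 0; the function names are illustrative.

```python
import math

def ln_q(u, q):
    """q-deformed logarithm (u > 0)."""
    return math.log(u) if q == 1 else (u ** (1 - q) - 1) / (1 - q)

def F_q(u, q):
    """Integral of ln_q from 1 to u, plus the constant A that makes F_q(0) = 0."""
    if q == 1 or q >= 2:
        raise ValueError("this sketch covers q < 2 with q != 1 only")
    return (u ** (2 - q) / (2 - q) - u) / (1 - q)

def derivative(func, u, h=1e-5):
    """Central finite difference, used only to test F_q' = ln_q."""
    return (func(u + h) - func(u - h)) / (2 * h)

for q in (-1, 0.5, 1.5):
    assert F_q(0.0, q) == 0.0
    assert abs(derivative(lambda t: F_q(t, q), 0.7) - ln_q(0.7, q)) < 1e-6
```

For q = 1/2 and q = −1 the function reduces to 2u((2/3)√u − 1) and ½u(u²/3 − 1) respectively, which can be compared with the appendix.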

7 Variational principle on the level of microstates

Let a model be given with probability distributions of the form (5) and fix one
microstate x. Then each probability density f(x) defines a function M_{x,f}(β)
by

    M_{x,f}(\beta) = -c(x) \, F_q\left( \frac{f(x)}{c(x)} \right) - \left[ \alpha(\beta) + \beta H(x) \right] f(x).    (32)

Figure 3: Basic property of a convex function.

See Figure 3. It is now easy to prove that M_{x,f}(β) ≤ M_{x,f_β}(β) for all β
for which f_β(x) > 0. In other words, the maximum of (32) over f is realised by
the equilibrium probability distributions of the model.

The proof goes as follows. The function F_q(u) is convex. Hence, its value
at the point f(x)/c(x) lies above the straight line which is tangent to the
function at the point f_β(x)/c(x). See Figure 3. In formulas this gives

    F_q\left( \frac{f(x)}{c(x)} \right) \ge F_q\left( \frac{f_\beta(x)}{c(x)} \right)
        + \frac{f(x) - f_\beta(x)}{c(x)} \, F_q'\left( \frac{f_\beta(x)}{c(x)} \right).    (33)

Use now F_q'(u) = ln_q(u) in combination with (7) to obtain

    F_q\left( \frac{f(x)}{c(x)} \right) \ge F_q\left( \frac{f_\beta(x)}{c(x)} \right)
        + \left[ -\alpha(\beta) - \beta H(x) \right] \frac{f(x) - f_\beta(x)}{c(x)}.    (34)

This can be written as

    M_{x,f}(\beta) \le M_{x,f_\beta}(\beta) \equiv M_x(\beta).    (35)

The inequality (35) holds for all examples, even when the average ⟨H⟩_β
diverges. Take for instance the q = 3 example of the speed of the harmonic
oscillator. One finds

    M_{x,f}(\beta) = \left( 1 - \frac{v_0^2}{v^2} \right) f(v)
        - \frac{\sqrt{2}}{\pi |v|} \left[ \frac{1}{\sqrt{2} \pi |v| f(v)} + A - 1 \right].    (36)

This expression is maximal when f(v) is given by (18).

If q = 1 then the equilibrium value M_x(β) is identically zero. The
variational principle then says that for any probability density f(x)

    -f(x) \ln \frac{f(x)}{c(x)} + \left[ -\alpha(\beta) - \beta H(x) \right] f(x) \le 0,    (37)

with equality when f(x) = f_β(x).
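
The microstate inequality can be probed numerically at a single point x. The sketch below is a hedged illustration: it takes F_q(u) as the integral of ln_q from 1 to u plus an arbitrary constant A, defines M(f) = −c F_q(f/c) + t f with t standing for −α(β) − βH(x), and checks on a grid of trial values that no f exceeds the value reached at f = c exp_q(t). The numbers c = 0.8 and t = −0.3 are arbitrary choices.

```python
import math

def exp_q(u, q):
    """q-deformed exponential, 0 or +infinity outside the range of ln_q."""
    if q == 1:
        return math.exp(u)
    base = 1 + (1 - q) * u
    if base <= 0:
        return 0.0 if q < 1 else math.inf
    return base ** (1 / (1 - q))

def F_q(u, q, A=0.0):
    """Integral of ln_q(v) from 1 to u, plus a free constant A (u > 0)."""
    if q == 1:
        return u * math.log(u) - u + 1 + A
    if q == 2:
        return u - 1 - math.log(u) + A
    return ((u ** (2 - q) - 1) / (2 - q) - (u - 1)) / (1 - q) + A

def M(f, c, t, q, A=0.0):
    """Microstate functional; t stands for -alpha(beta) - beta * H(x)."""
    return -c * F_q(f / c, q, A) + t * f

c, t = 0.8, -0.3
for q in (0.5, 1, 2, 3):
    f_eq = c * exp_q(t, q)          # equilibrium value at this microstate
    best = M(f_eq, c, t, q)
    for k in range(1, 501):
        assert M(0.01 * k, c, t, q) <= best + 1e-12
```

Because F_q is convex, M is concave in f, so the grid search cannot find a larger value than the one at the equilibrium density; the constant A only shifts M and does not affect the location of the maximum.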

Figure 4: Φ(β) determined by taking the supremum of straight lines.

8 Proof of the variational principle 

It is now easy to prove the variational principle. Assume that H(x) is
bounded from below and that the expectation value

    \langle H \rangle_\beta = \int dx \, f_\beta(x) H(x)    (38)

converges. Then integration of (32) gives

    -\infty < \int dx \, M_{x,f}(\beta) = I_q(f) - \alpha(\beta) - \beta \int dx \, f(x) H(x).    (39)

The inequality (35) implies that (39) is maximal when f equals f_β. Because
α(β) does not depend on f it may be subtracted. The statement then says
that

    I_q(f) - \beta \int dx \, f(x) H(x)    (40)

is maximal when f equals f_β as given by (5). This is the variational principle.

9 Legendre transform 

Note that (40) is a linear function of β. Hence, it determines a straight line
in the parameter space. See Figure 4. All these straight lines together
determine a convex function

    \Phi(\beta) = I_q(f_\beta) - \beta \int dx \, f_\beta(x) H(x).    (41)

This is Massieu's function.

The thermodynamic entropy S(U) is a function of the internal energy U.
Because the latter is a monotonic function of β one can make the
identifications

    S(U) = I_q(f_\beta)   and   U = \langle H \rangle_\beta = \int dx \, f_\beta(x) H(x).    (42)

Then (41) in combination with (40) becomes

    \Phi(\beta) = S(U) - \beta U = \sup_{U'} \{ S(U') - \beta U' \}.    (43)

In particular, this means that Massieu's function Φ(β) is the Legendre
transform of the entropy S(U), as is well known from thermodynamics. An
immediate consequence is

    \frac{d\Phi}{d\beta} = -U.    (44)

The inverse Legendre transformation is

    S(U) = \inf_{\beta'} \{ \Phi(\beta') + \beta' U \}.    (45)

This result automatically implies the well-known formula

    \frac{dS}{dU} = \beta.    (46)



10 Dual structure 

The equations (44) and (46) are dual relations in the sense of thermodynamics.
The parameter β is the dual of the quantity U. Usually, β is the inverse
temperature, which is an 'intensive thermodynamic coordinate', while U is
the average total energy and is 'extensive'. However, the examples of the
microcanonical ensemble show that this standard interpretation is specific
for the canonical ensemble.

In a mathematical context the same duality between model parameters
and estimators (averaged quantities used to estimate model parameters) was
given a geometric interpretation by Amari [21 HI [31]. His parameter α is
related to the deformation index q by α = 1 − 2q. The geometric interpretation
concerns the statistical manifold, the definition of which is given in Section 13.
The flatness of the statistical manifold is equivalent to the validity of the dual
relations (44), (46). Many examples found in the literature on nonextensive
thermostatistics involve a curved manifold, which implies that the parameter
β does not satisfy (46) and hence, in the context of a canonical ensemble,
does not coincide with the inverse of the thermodynamic temperature. See for
instance |2l] and the references quoted there for a discussion about different
definitions of temperature in nonextensive thermostatistics.

Amari's work was the basis for the generalisation of the exponential family
mentioned in the Introduction. Here, these geometrical insights are reviewed
in the context of the q-exponential family.



11 Estimating inverse temperature 

In principle, knowledge of the average energy U allows determination of the
model parameter β. A direct measurement of the total energy U may be possible
only in exceptional cases. However, one can add extra parameters to the model
and measure corresponding quantities to estimate these parameters. For
simplicity only one parameter is considered here.

The value obtained by experimentally measuring U has some uncertainty.
It is then natural to ask how large the uncertainty on the estimated parameter
β is. This depends on the size of the derivative dU/dβ. Indeed,
if U depends only weakly on β then a small error in U leads to a large error
in the estimated value of β. Now remember that U is minus the derivative
of the Massieu function Φ (see (44)). Hence, the relevant quantity is

    g(\beta) = \frac{d^2 \Phi}{d\beta^2} = -\frac{dU}{d\beta}.    (47)

This is called the metric tensor. In the case of multiple parameters it is a
matrix. Because Φ is convex g(β) is always positive.

It is known for models belonging to the exponential family that the Fisher
information matrix is equal to the metric tensor. Below it will be shown that
this relation can be extended to models belonging to the q-exponential family.



12 Fisher information 

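The q-deformed Fisher information is defined by

    I_\beta = \int dx \, c(x) \left( \frac{c(x)}{f_\beta(x)} \right)^q \left( \frac{\partial}{\partial\beta} \frac{f_\beta(x)}{c(x)} \right)^2.    (48)

Note that this definition differs from that studied in [HI [9l [IH [El [161 [H]. It
also differs by a scalar factor from the definition given in [19], [33], because in
the latter papers the definition is given in terms of a normalised escort
probability distribution. Here, the normalisation is omitted so that the equality
I_β = g(β) holds without involving a normalisation function.
In order to prove that I_β = g(β), take the derivative of (7). This gives

    \left( \frac{c(x)}{f_\beta(x)} \right)^q \frac{\partial}{\partial\beta} \frac{f_\beta(x)}{c(x)} = -\frac{d\alpha}{d\beta} - H(x).    (49)

Combining (49) with the definition (48) gives

    I_\beta = \int dx \left( -\frac{d\alpha}{d\beta} - H(x) \right) \frac{\partial}{\partial\beta} f_\beta(x)    (50)

            = -\frac{d\alpha}{d\beta} \frac{d}{d\beta} \int dx \, f_\beta(x) - \int dx \, H(x) \frac{\partial}{\partial\beta} f_\beta(x).    (51)

But note that 1 = ∫ dx f_β(x) implies that

    \frac{d}{d\beta} \int dx \, f_\beta(x) = 0.

Hence, (50) simplifies to

    I_\beta = -\int dx \, H(x) \frac{\partial}{\partial\beta} f_\beta(x).    (52)

In combination with (42) and (47) this yields I_β = g(β).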
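
The equality of the deformed Fisher information and the metric tensor can be tested numerically on the q-Gaussian family with q = 1/2. The sketch below rests on several assumptions that are not spelled out in this form in the text: the Fisher information is taken as ∫ dx c^{q−1} f_β^{−q} (∂f_β/∂β)² on the support of f_β, the model uses c(x) = 1/c_q, H(x) = x² and σ = β^{1/(q−3)}, derivatives are central finite differences, and integrals use the trapezoid rule.

```python
import math

Q = 0.5
C_Q = 16 * math.sqrt(2) / 15        # normalisation constant c_q for q = 1/2

def density(x, beta):
    """q-Gaussian for q = 1/2 with sigma = beta**(1/(Q - 3)); zero off its support."""
    sigma = beta ** (1 / (Q - 3))
    base = 1 - (1 - Q) * (x / sigma) ** 2
    return base ** (1 / (1 - Q)) / (C_Q * sigma) if base > 0 else 0.0

def integrate(func, lo=-2.0, hi=2.0, n=4000):
    """Trapezoid rule; the interval covers the support for the beta used below."""
    dx = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for k in range(1, n):
        total += func(lo + k * dx)
    return total * dx

def fisher(beta, h=1e-4):
    """Deformed Fisher information with prior c(x) = 1 / c_q."""
    c = 1 / C_Q
    def integrand(x):
        f_minus, f, f_plus = density(x, beta - h), density(x, beta), density(x, beta + h)
        if min(f_minus, f, f_plus) <= 0:
            return 0.0              # skip points at or beyond the moving boundary
        df = (f_plus - f_minus) / (2 * h)
        return c ** (Q - 1) * f ** (-Q) * df * df
    return integrate(integrand)

def mean_energy(beta):
    """U(beta) = average of H(x) = x^2."""
    return integrate(lambda x: x * x * density(x, beta))

def metric(beta, h=1e-4):
    """g(beta) = -dU/dbeta by central difference."""
    return -(mean_energy(beta + h) - mean_energy(beta - h)) / (2 * h)

assert abs(fisher(1.0) - metric(1.0)) < 1e-3 * metric(1.0)
```

For this family U = (2/7) β^{−4/5}, so both quantities should be close to (8/35) β^{−9/5}; the guard in `integrand` only discards a boundary layer whose width shrinks with h.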

13 The statistical manifold 

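The statistical manifold is now the map

    \beta \mapsto \ln_q\left( \frac{f_\beta(x)}{c(x)} \right).    (53)

It reduces to the log-likelihood function β ↦ ln(f_β(x)/c(x)) in the limit q = 1.
The tangent vector

    X_\beta(x) = \frac{\partial}{\partial\beta} \ln_q\left( \frac{f_\beta(x)}{c(x)} \right) = -\frac{d\alpha}{d\beta} - H(x)    (54)

is the generalised score variable. Its average length is defined by

    \| X_\beta \|^2 = \int dx \, f_\beta(x)^q \left( X_\beta(x) \right)^2.    (55)

A short calculation shows that the latter expression equals the Fisher
information, i.e. ||X_β||² = I_β.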

14 Final remarks

In this paper the definition of the generalised exponential family [19], [20], [33] is
considered in the context of nonextensive statistical physics, where it is called
the q-exponential family. Many models of nonextensive statistical physics
belong to this family, while others do not because quite often the normalisation
is written as a prefactor instead of being written inside the q-exponential. It
should be stressed that the prefactor c(x) and the Hamiltonian H(x) in the
r.h.s. of (5) must not depend on the parameter β, while the normalisation
must not depend on the variable x.

Several examples of models belonging to the q-exponential family have
been given. In particular, the q-Gaussians can be written in the required form
(5). The q-Gaussian model receives a lot of interest because it appears as
the central limit of strongly correlated models (see for instance [22], [29], [25],
[26], [27], \3^). To my knowledge, the examples concerning the microcanonical
ensemble appear in the literature for the first time.

The role of the escort probabilities [13] has not been discussed. But the
unnormalised escort probabilities f_β(x)^q appear prominently, for instance in
(55). By leaving out the normalisation in the definition (48) of the Fisher
information the metric tensor g equals the Fisher information, while in [19],
[33] the normalisation factor enters as a multiplicative factor.

Only continuous distributions have been considered here. The translation
to discrete probability distributions is straightforward. The transition
to quantum models requires more attention but is feasible. An early step in
this direction is found in [21]. In particular, the quantum analogue of (28) is
I(ρ) = −Tr F(ρ). The prior weights c(x) must all be taken equal before making
the transition to quantum mechanics because cyclic permutation under
the trace is essential.

The presentation has been restricted to single parameter models. The
extension to more than one parameter is obvious. Note that in the mathematics
literature also non-parametric models are considered [5], [7].

Some topics have been left out of the paper. In particular, the relative
entropy of the Bregman type was not mentioned. Neither was the relation
between Fisher information and the inequality of Cramer and Rao. Both can
be found in [19] in the more general context. Finally, note that one can expect
that the generalisations discussed in the present paper, and, in particular,
the geometric insight behind them, may lead to powerful applications.
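
Appendix

For convenience, explicit expressions used in the examples of Section 4 have
been brought together in the following table.

    q      ln_q(u)             exp_q(u)             F_q(u)
    1      ln u                exp(u)               u ln u
    2      1 − 1/u             [1 − u]_+^{−1}       u − ln u + A − 1
    3      (1/2)(1 − 1/u²)     [1 − 2u]_+^{−1/2}    (1/2)(u + 1/u) + A − 1
    1/2    2(√u − 1)           [1 + u/2]_+^{2}      2u((2/3)√u − 1)
    −1     (1/2)(u² − 1)       [1 + 2u]_+^{1/2}     (1/2)u((1/3)u² − 1)

The corresponding expressions for the entropy functional I_q(f) are

    I_1(f) = -\int dx \, f(x) \ln \frac{f(x)}{c(x)},    (56)

    I_2(f) = -1 + \int dx \, c(x) \left( \ln \frac{f(x)}{c(x)} - A + 1 \right),    (57)

    I_3(f) = -\frac{1}{2} - \int dx \, c(x) \left( \frac{c(x)}{2 f(x)} + A - 1 \right),    (58)

    I_{1/2}(f) = 2 - \frac{4}{3} \int dx \, f(x) \sqrt{\frac{f(x)}{c(x)}},    (59)

    I_{-1}(f) = \frac{1}{2} - \frac{1}{6} \int dx \, f(x) \left( \frac{f(x)}{c(x)} \right)^2.    (60)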